Add pageCount to crawls and uploads and use in frontend for page counts #2315

Merged
ikreymer merged 8 commits into main from issue-2257-archived-item-page-count on Jan 16, 2025

@tw4l tw4l commented Jan 16, 2025

Fixes #2257

This is a follow-up to the public collections work, which adds pages to the database for uploads. All crawls and uploads now have a pageCount field which is populated when the item is successfully added. A new migration is also added to populate the field for existing archived items that don't have it set yet.
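
As a rough illustration of the backfill idea described above (not the actual migration in this PR), the sketch below counts pages per archived item and sets pageCount only where it is missing. It works on plain in-memory dicts; the function name `backfill_page_counts` and the `id` and `crawl_id` keys are assumptions made for the example, and only `pageCount` comes from the PR.

```python
from typing import Dict, List


def backfill_page_counts(items: List[dict], pages: List[dict]) -> int:
    """Set pageCount on items that lack it; return how many were updated.

    Hypothetical helper: `items` stand in for archived items (crawls and
    uploads) and `pages` for page records that point at an item through an
    assumed "crawl_id" key.
    """
    # Count pages per archived item id.
    counts: Dict[str, int] = {}
    for page in pages:
        item_id = page["crawl_id"]
        counts[item_id] = counts.get(item_id, 0) + 1

    updated = 0
    for item in items:
        # Leave items that already have a pageCount untouched.
        if item.get("pageCount") is None:
            item["pageCount"] = counts.get(item["id"], 0)
            updated += 1
    return updated
```

Skipping items that already have the field means a second run of this sketch changes nothing.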

OrgMetrics now also includes crawlPageCount and uploadPageCount, with pageCount holding the total of both. All three are shown in the frontend org dashboard.
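
The relationship between the three metric fields can be sketched as follows. This is a minimal standalone illustration, not the real OrgMetrics model: the class name `PageCountMetrics` is hypothetical, and only the field names crawlPageCount, uploadPageCount, and pageCount come from the PR.

```python
from dataclasses import dataclass


@dataclass
class PageCountMetrics:
    """Hypothetical stand-in for the page-count part of the org metrics."""

    crawlPageCount: int
    uploadPageCount: int

    @property
    def pageCount(self) -> int:
        # Total pages across both crawls and uploads.
        return self.crawlPageCount + self.uploadPageCount
```

For example, `PageCountMetrics(crawlPageCount=10, uploadPageCount=5).pageCount` evaluates to 15.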

The frontend has been updated to use pageCount rather than stats.done wherever appropriate, meaning that in archived item lists and details we now have a consistent page count for both crawls and uploads.

Screenshots

Org Dashboard

Screenshot 2025-01-16 at 11 55 40 AM

Archived Items list

Screenshot 2025-01-16 at 11 56 36 AM

Crawl Detail

Screenshot 2025-01-16 at 11 57 08 AM

Upload Detail

Screenshot 2025-01-16 at 12 14 33 PM

Collection Archived Items List

Screenshot 2025-01-16 at 11 57 25 AM

Testing

(Requires having both backend and frontend deployed from this branch)

New functionality

  • Deploy this branch
  • Create new crawls and uploads and verify that page count appears correctly throughout the frontend for all new crawls and uploads

Migration

  • Deploy from latest main
  • Create some crawls and uploads
  • Change to this branch and re-deploy
  • Verify migration ran without errors in backend logs
  • Verify that page counts have been populated: check the archived items lists, crawl and upload detail pages, and the dashboard to confirm that no page counts are missing.

@tw4l tw4l requested review from SuaYoo, emma-sg and ikreymer January 16, 2025 17:10

@emma-sg emma-sg left a comment

Nice, looks good!

@ikreymer ikreymer merged commit 6797b41 into main Jan 16, 2025
27 checks passed
@ikreymer ikreymer deleted the issue-2257-archived-item-page-count branch January 16, 2025 22:41
Development

Successfully merging this pull request may close these issues.

Add page count to crawl model
3 participants